x-recover 0·11

A program to attempt recovery of an x-file's contents

X-Files is an image filing system written by Andy Armstrong. Normal (filecore based) Risc OS filing systems are limited to 10 character leafnames and 77 files per directory. X-Files supports long filenames (up to 256 characters internally) and an unlimited number of files per directory, storing files in an x-file (which in other respects behaves like a "normal" directory).

Unfortunately X-Files is not totally robust, and can occasionally corrupt an x-file, which it will then refuse to open. Sod's law ensures that the data in the x-file is very important and not backed up anywhere else. This is where x-recover comes in; x-recover will attempt to make sense of and extract data from corrupted x-files.

Important: x-recover comes with absolutely NO WARRANTY

This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose. As x-recover is written in C it almost certainly breaks the rules somewhere, and hence is exhibiting "undefined behaviour". What this means in English is that it could do anything at all, including, but not limited to:

  1. Working exactly as intended
  2. Trashing your hard disc
  3. Turning Barbara Cartland into a goth*

Usage

x-recover [options] <x-file> <destination>

x-file is the pathname of the x-file to attempt to recover. x-recover does not check the filetype (so if x-file is not actually an x-file x-recover will generate copious warnings as it discovers this).

destination is the pathname of a directory (or empty x-file) to write recovered contents into. destination should already exist; if it does not, use the -c option to create a new x-file with this name. destination can be omitted if file output is suppressed (-g or -n options).

Options

-a  extract the full chunk allocated to the file, rather than the length used. Using this option causes the chunktable to be written out with a filename of the form file#### [unless file output is suppressed with -s].
-c  create destination as an x-file if no directory/x-file of this name exists.
-d  output probable directories in raw binary form. See Directories.
-f  write out any free space between chunks. x-recover -a -d -f -1 will output the entire x-file split into chunks, which if concatenated in numerical order will give the original x-file. [see Size]
-g  guess the location of the chunktable and the root directory. x-recover simply runs through the file looking for areas that resemble the chunktable and the root directory. When used with the -r and -t options this allows recovery even when the file header is corrupted. This option suppresses all disc output.
-n  no disc output. Integrity checks and errors are still reported to the screen.
-r <offset>  specify the offset of the root directory in x-file. Useful if the header becomes corrupted - see Header.
-s  suppress file output, but still create the directories (and free space) in destination. Mostly used for development purposes to quickly check that things are working without having to wait while files are copied, but x-recover -1 -f -s will extract only directories and the free space between directories.
-t <offset>  specify the offset of the chunktable in x-file. Useful if the header becomes corrupted - see Header.
-v  verbose info. x-recover prints out more information about what it is doing. Use more vs for more verbosity [currently the most verbose is -vvvvv].
-1  use Method 1 to attempt to recover x-file's contents. The various methods are described below.
-2  use Method 2 to attempt to recover x-file's contents. Method 2 is the current default.

The x-file structure, and how this affects the prognosis

As an x-file contains within itself data from other files, it must also store information about the contained files and their data. If an x-file becomes corrupted, it is likely that some of this housekeeping information is lost. The x-file structure is described here - which parts survive determines whether recovery is possible, and if so the fidelity achievable.
Header
An x-file starts with a header which contains a signature, version information and pointers to the chunktable and root directory. If the information stored in the header becomes corrupted it is possible to search for the chunktable using the -g option.
Chunktable
Except for the header all information in the x-file is stored in chunks. The index containing chunk sizes and positions is stored in the chunktable - clearly if the chunktable is missing then the x-file is simply an amorphous lump of data (like a corrupt hard disc). Currently x-recover doesn't have the knowledge to identify files from inspection of contents, so if the chunktable is missing automated recovery is not possible as x-recover cannot determine where one file stops and the next starts. If the chunktable is reasonably intact then Method 1 can be used to extract the contents of each chunk as separate files, but all name, filetype and datestamp information will be lost.
Root Directory
Information about names, filetypes, attributes and modification time is stored in the root directory and its subdirectories. If the root directory can be found (either from the header or using the -g option) then Method 2 can be used to attempt recovery of the x-file's contents and directory structure, including file type information.

Method 1

Method 1 reads in the chunktable, and then systematically writes out the contents of each non-free chunk as a file in the destination directory/x-file. If it suspects that the chunk represents a directory it writes a text file describing the directory's contents, else it copies out the raw contents with filetype Data (FFD). All chunks can be copied out as raw contents with the -d option. Files and directories are named file0000, file0001, etc. in the order that they are found in the x-file body (probably not the same order as the chunktable - use -vv (very verbose) or greater to list the chunktable).
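The numbering rule can be sketched in C as follows. This is an illustration only, not x-recover's own code: struct chunk, its fields and number_chunks() are hypothetical names, the on-disc chunktable layout is not described on this page, and it is an assumption here that free chunks are skipped when numbers are handed out.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical in-memory form of one chunktable entry; the real on-disc
   layout is not described in this document. */
struct chunk {
    unsigned long offset;  /* position of the chunk within the x-file */
    unsigned long size;    /* size of the chunk in bytes */
    int is_free;           /* non-zero if the chunk holds no data */
    long file_number;      /* assigned below; -1 if not written out */
};

/* qsort comparator: ascending order of position in the x-file body. */
static int by_offset(const void *a, const void *b)
{
    const struct chunk *ca = a;
    const struct chunk *cb = b;
    return (ca->offset > cb->offset) - (ca->offset < cb->offset);
}

/* Sort the table into x-file order, then give each non-free chunk the
   next number (0 for file0000, 1 for file0001, ...).
   Returns the number of chunks that received a number. */
long number_chunks(struct chunk *table, size_t count)
{
    long next = 0;
    size_t i;

    if (count == 0)
        return 0;
    assert(table != NULL);

    qsort(table, count, sizeof *table, by_offset);
    for (i = 0; i < count; i++)
        table[i].file_number = table[i].is_free ? -1 : next++;
    return next;
}
```

Sorting by offset first is what makes the numbers follow the x-file body rather than the chunktable order.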

Method 2

Method 2 reads in the chunktable and the root directory. Starting in the root directory it attempts to create a list of all files, their filetype, attributes, and location within the x-file (obtained via the chunktable). It recreates this directory structure within the destination directory/x-file, and then copies out all the files it found with their correct filenames, restoring filetypes, access attributes and modification times. It then tries to tally files that it knows exist but has no location with un-recovered chunks from the chunktable, and writes out any successful matches to the destination. Finally it writes the contents of any remaining unrecovered chunks as files in the destination using Method 1. Well, that's the plan...

Directories

Like the normal Risc OS filing system, X-Files writes a special signature at the start of all directories so that when it reads the chunk containing the directory information, it can check that the information is not corrupted. X-Files' signature is the string Andy at the start of the chunk. Hence, if x-recover comes across a chunk about which it has no information, it will have a look for this signature. If present, the chunk is assumed to be a directory, which means that there is a small chance that a file which starts with the text Andy will be interpreted as a directory and consequently garbled. If this happens, use the -d option to disable directory identification, and the file contents will be recovered intact.
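A minimal sketch of such a signature test in C, assuming only what is stated above (the four characters Andy at the very start of the chunk). The function name chunk_looks_like_directory() is made up for illustration and is not taken from x-recover's source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The four-byte signature described above, with no terminating NUL. */
static const char DIR_SIGNATURE[4] = { 'A', 'n', 'd', 'y' };

/* Returns 1 if the chunk starts with the directory signature, else 0.
   A chunk shorter than the signature cannot be a directory. */
int chunk_looks_like_directory(const unsigned char *chunk, size_t length)
{
    assert(chunk != NULL);
    if (length < sizeof DIR_SIGNATURE)
        return 0;
    return memcmp(chunk, DIR_SIGNATURE, sizeof DIR_SIGNATURE) == 0;
}
```

A text file that begins "Andy Armstrong wrote..." passes this test just as a real directory does, which is exactly the false positive described above.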

Size

The size of a file is stored twice in an x-file:

  1. The directory stores the size of a file
  2. The chunktable stores the size of a chunk

As the directory also stores an index into the chunktable, it is possible to have conflicting sizes reported for a given file. In this case x-recover will use the larger of the size in the directory and the size in the chunktable. This could cause the recovered file to acquire a copy of the start of the next chunk, which can mean that the sum of the sizes of the recovered parts is greater than the size of the x-file. You don't get any extra information - you just get some of it twice!
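The rule, and the duplication it can cause, is easy to state in C. This is a sketch only: the function names recovered_size() and overrun_bytes() are invented for illustration and are not x-recover's own.

```c
#include <assert.h>
#include <stddef.h>

/* Size used for recovery when the directory and the chunktable disagree:
   the larger of the two, as described above. */
size_t recovered_size(size_t dir_size, size_t chunk_size)
{
    return dir_size > chunk_size ? dir_size : chunk_size;
}

/* Number of bytes copied from beyond the end of the chunk, i.e. bytes
   that also belong to whatever follows this chunk in the x-file. */
size_t overrun_bytes(size_t dir_size, size_t chunk_size)
{
    size_t used = recovered_size(dir_size, chunk_size);
    assert(used >= chunk_size);  /* holds by construction */
    return used - chunk_size;
}
```

For example, if the directory claims 1200 bytes but the chunk is 1024 bytes long, the recovered file is 1200 bytes and its last 176 bytes are a second copy of data that follows the chunk.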

Bugs

We don't do bugs.

None known - everything works to the design. However, there are known design deficiencies (and planned improvements). Please report bugs (preferably with fixes) to <bagpuss@done.net>. If you can supply an x-file to demonstrate then this would be useful. Currently I'm quite happy for relevant e-mail up to 1Mb, but if bagpuss.done.net is up then use anonymous ftp to upload problem files to ftp://bagpuss.done.net/

Files up to 100Mb are acceptable by this method. No, I'm not confusing Kb with Mb. If you're on Janet then you should be able to shift 100Mb to me in 7 minutes.

Design deficiencies

  1. Unrecovered chunks are written out as file#### in the destination directory/x-file after files are recovered, overwriting any genuine files with the same name(s). Of course, no-one names files like this...
    (So if you recover one x-file into another rename these files rapidly.)
  2. The file#### naming system assumes that you have fewer than 10000 chunks. If this is violated x-recover will probably crash from undefined behaviour as several internal buffers overflow. Don't say that I didn't warn you - this software comes with absolutely no warranty.
  3. Method 2 doesn't re-attach subdirectories (and hence cannot recover name/date/attribute information for files contained in these directories). The chunks are recovered by Method 1. I can't just stick them in the destination directory/x-file in case two names clash, as the second will overwrite the first. Method 3 (when written (when someone asks me to)) will hook unclaimed directories into the correct parent (with names like dir_0000 ).
  4. Likewise, Method 2 (and 3) ignore entries in the dirhash that don't correlate with full filenames. Consider what would happen with two files aardvark and aardwolf...
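For deficiency 2, one possible way to build file#### names without overflowing a buffer is sketched below. It is not how x-recover currently does it: make_chunk_name() and CHUNK_NAME_MAX are hypothetical names, and snprintf assumes a C99 library, which may not be available with every Risc OS compiler.

```c
#include <assert.h>
#include <stdio.h>

/* Room for "file", four digits and the terminating NUL. */
#define CHUNK_NAME_MAX 9

/* Writes "file0000" .. "file9999" into buf.
   Returns 0 on success, or -1 if the index needs more than four digits
   or the buffer is too small, rather than overflowing. */
int make_chunk_name(char *buf, size_t buflen, unsigned long index)
{
    int n;

    assert(buf != NULL);
    if (index > 9999UL)
        return -1;

    n = snprintf(buf, buflen, "file%04lu", index);
    if (n < 0 || (size_t)n >= buflen)
        return -1;  /* output error or truncated name */
    return 0;
}
```

A caller would declare char name[CHUNK_NAME_MAX] and skip (or report) any chunk for which the function returns -1.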

Don't be caught out by

Method 2 restores file permissions (where known). If it restores a file as LR/ (no write access) and an attempt is made to recover the x-file to the same directory, x-recover will not be able to open the file again, so will recover the contents as a Method 1 unrecovered chunk.

Current version

If there is a newer version of x-recover it is probably roughly here. If it's missing (or deceased) e-mail me!


* Black is a much nicer colour than pink


This page was last reviewed on Monday January 13th 1997
Nicholas Clark <Nicholas.Clark@Liverpool.ac.uk>